Efficient, Feature-based, Conditional Random Field Parsing
Authors
Abstract
Discriminative feature-based methods are widely used in natural language processing, but sentence parsing is still dominated by generative methods. While prior feature-based dynamic programming parsers have restricted training and evaluation to artificially short sentences, we present the first general, feature-rich discriminative parser, based on a conditional random field model, which has been successfully scaled to the full WSJ parsing data. Our efficiency is primarily due to the use of stochastic optimization techniques, as well as parallelization and chart prefiltering. On WSJ15, we attain a state-of-the-art F-score of 90.9%, a 14% relative reduction in error over previous models, while being two orders of magnitude faster. On sentences of length 40, our system achieves an F-score of 89.0%, a 36% relative reduction in error over a generative baseline.
Similar resources
Shallow Discourse Parsing with Conditional Random Fields
Parsing discourse is a challenging natural language processing task. In this paper we take a data driven approach to identify arguments of explicit discourse connectives. In contrast to previous work we do not make any assumptions on the span of arguments and consider parsing as a token-level sequence labeling task. We design the argument segmentation task as a cascade of decisions based on con...
Part of Speech Tagging and Shallow Parsing of Indian Languages
This paper describes and evaluates shallow parsing of several Indian languages utilizing Conditional Random Field models. We show how performance can be substantially improved by several feature enhancements and improved modeling techniques, including expanding the chunk tag inventory, and separating punctuation from linguistic phrases. We also report results from part of speech tagging of Hind...
Web Reviews and Events Matching Based on Event Feature Segments and Semi-Markov Conditional Random Fields
To establish links between a large number of reviews and events, we propose a web reviews and events matching approach by event feature segments and semi-Markov conditional random fields (CRFs). We extract named entities and verb phrases from reviews as event feature segments. We use semi-Markov CRFs to label the reviews and to recognize event feature segments at the segment level. This approac...
Shallow Parsing with Conditional Random Fields
Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random fiel...
Analysis and Enhancement of Conditional Random Fields Gene Mention Taggers in BioCreative II Challenge Evaluation
Background: Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In BioCreative 2 challenge, the conditional random fields model (CRF) was the most prevailing method in the gene mention task. In this paper, we analyze two best performing CRF-based systems in BioCreative 2. We examine their key claims and propose enhancement based on the an...
Publication date: 2008